Data mining is the process of extracting hidden, previously unknown, and
potentially useful information from large amounts of real data stored in a
database. Image mining is a branch of data mining and is used here as a
predictive measure to identify the age of a tiger. This research work combines
the domains of image processing and data mining to predict the age of a tiger
from different kinds of color images. The fuzzy iterative self-organizing data
analysis (FISODATA) clustering method requires several predefined parameters:
the maximum number of iterations, the minimum number of points in a cluster,
and the minimum distance between cluster centers. The key task of the study is
to determine the age of the tiger using color-pixel-based image segmentation
together with data mining techniques. The additional matrix components are
evaluated by measuring the processing time, retrieval time, accuracy, and
error rate, with the aim of producing better performance.

1. INTRODUCTION 

Data mining is the process of extracting previously unknown and potentially significant knowledge from
the large amounts of concrete data stored in a database (Mahmud et al. [1]). Many data mining techniques
are available, such as clustering, classification, association, regression, and evaluation (Chakrabortya et al.
[2]). Other significant techniques include pattern recognition, time series analysis, OLAP, and visualization
(Kumudham and Rajendran [3]). Advances in image acquisition and storage technologies have led to
tremendous growth in very large and informative image databases. Analysis of these images can reveal useful
information to human users (Caponetti et al. [4]). Image mining deals with the extraction of implicit
knowledge, image data relationships, and other patterns that are not explicitly stored in the image database
(Sudana et al. [5]). ISODATA stands for iterative self-organizing data analysis technique. It is an
unsupervised classification technique, so the number of clusters does not have to be decided in advance.
Fuzzy control is based on fuzzy logic, a logical system that is much closer to human reasoning and natural
language than conventional logical systems (Dubey et al. [6]). A fuzzy logic controller (FLC) uses fuzzy
logic to transform a linguistic control strategy based on expert knowledge into an automatic control strategy.
The ISODATA algorithm allows the number of clusters to be adjusted automatically during the iterations
(Ramaraj and Niraimathi [7]). In this work, the ISODATA clustering algorithm is modified into a fuzzy based
modified iterative self organizing data analysis technique (FMISODATA) clustering algorithm, following
Adeh and Dehnavi [8], and applied to the tiger image database. The main objective of this method is to
predict the age of the tiger.


2. REVIEW OF LITERATURE 

A great deal of work has been carried out in the field of image processing, where image segmentation
has been addressed using many different approaches, and much of this work focuses on the various image
segmentation technologies. The existing clustering algorithm has become one of the most fundamental
clustering algorithms, with many implementations that differ in the method used to initialize them. The
ISODATA technique was introduced by Ball and Hall, and others, in the 1960s. Ramaraj and Niraimathi [9]
proposed an algorithm in which good initial centroids are generated by an optimization approach. The
proposed clustering algorithm generates highly accurate clusters while reducing computational time.

Alkhalid [10] described an image segmentation approach based on image pixel categorization,
introduced for quality control, in which the proposed clustering algorithm and its matrix are computed by
applying the clustering process to a given dataset. Ramaraj and Niraimathi [11] presented clustering
techniques for color image segmentation, covering five techniques: K-means, ISODATA, mean shift,
splitting, and merging. Gautam and Singhai [12] introduced an automatic detection of route rumble strips,
which are important for many applications, including lane level navigation and lane departure warning.
Khan et al. [13] proposed a new spectral-spatial classification scheme for hyperspectral images.

These techniques are optimized to improve the performance of image classification and grouping by
combining the segmentation map produced by region-based segmentation with the number of clusters used
by the different classifiers. The authors of [14] proposed an advanced fuzzy based iterative self organizing
data analysis technique (AF-ISODATA) clustering algorithm for color remote sensing image segmentation.
Yang et al. [15] described color style transfer by constrained locally linear embedding. Abbas et al. [16]
presented a state based modified expectation maximization (MEM) algorithm for region image segmentation.
The proposed method decreases the number of iterations, so that the segmented image converges rapidly
and in a short time.

Wang and Wang [17] introduced a new approach based on an unsupervised clustering technique for
image segmentation that determines the best clustering of an image data set with little user intervention.
Ramaraj and Niraimathi [18] recognized that segmentation which depends solely on either pixel-based or
texture-based optimization algorithms does not contribute to the classification of remote sensing images with
high spatial resolution, since such images include both textured and non-textured regions [19].

Dhanachandra et al. [20] analyzed the performance of the unsupervised classification algorithms
ISODATA and K-means in remote sensing, which use iterative approaches to automatically group pixels with
similar spectral characteristics into unique clusters, and tested them statistically. Dhanachandra and Chanu
[21] presented a fast and efficient method for color image segmentation [22]. In addition, the computing
time was drastically reduced, allowing extremely large images to be processed in a reasonable time [23].


3. METHOD 

The fuzzy based iterative self organizing data analysis technique (FBISODATA) clustering algorithm
performs unsupervised classification by first calculating class means, which are presumed to be uniformly
distributed in the data space. The remaining pixels are then clustered iteratively using minimum-distance
techniques. Each iteration recalculates the means and reclassifies the pixels with respect to the new means
[24]. The FBISODATA clustering algorithm iterates over two operations, splitting and merging, which are
applied according to the input threshold parameters. All color pixels are grouped into the nearest color class
unless a standard deviation or distance threshold is specified [25], in which case some color pixels may
remain unclassified if they do not meet the threshold value [26]. The procedure is repeated until the number
of pixels in each class changes by less than a certain threshold, or until the maximum number of iterations
has been reached. FBISODATA clustering uses two parameter sets. The first parameter set does not change
during the clustering process [27]. The second can be adjusted interactively until an acceptable clustering
result is obtained.

The algorithm proceeds as follows. 1) Form the original characteristic indicator matrix $U^*$ that describes
each attribute of every object under investigation and of the reference samples, where $u^*_{ij}$ is
characteristic indicator $j$ of object $i$. Normalize $U^*$ to obtain $U$: for column $j$ of $U^*$, let
$M_j = \max(u^*_{1j}, u^*_{2j}, \ldots, u^*_{nj})$ and $m_j = \min(u^*_{1j}, u^*_{2j}, \ldots, u^*_{nj})$,
and compute $u_{ij}$ using the formula:


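$$u_{ij} = \frac{u^*_{ij} - m_j}{M_j - m_j} \qquad (1)$$

2) Start the iterative process from the initial cluster center matrix $V^{(0)}$ of the reference sample system
and compute the fuzzy classification matrix $R^{(l)} = (r^{(l)}_{ik})$ using the formula:

$$r^{(l)}_{ik} = \left[ \sum_{j=1}^{c} \left( \frac{\lVert u_k - V^{(l)}_i \rVert}{\lVert u_k - V^{(l)}_j \rVert} \right)^{\frac{2}{m-1}} \right]^{-1} \qquad (2)$$

where $u_k$ is row $k$ of $U$ (object $k$), $c$ is the number of cluster categories, and $m > 1$ is the
fuzziness exponent discussed in section 3.3 (not to be confused with the column minimum $m_j$).

3) Modify the cluster center matrix for $R^{(l)}$:

$$V^{(l+1)}_i = \frac{\sum_{k=1}^{n} \left(r^{(l)}_{ik}\right)^{m} u_k}{\sum_{k=1}^{n} \left(r^{(l)}_{ik}\right)^{m}} \qquad (3)$$

Here, $V^{(l+1)} = (V^{(l+1)}_1, V^{(l+1)}_2, \ldots, V^{(l+1)}_c)^T$.

4) Repeat step 2) to obtain $R^{(l+1)}$ and compare $R^{(l)}$ with $R^{(l+1)}$: for a given precision
$\varepsilon > 0$, if $\max_{i,k} \lvert r^{(l+1)}_{ik} - r^{(l)}_{ik} \rvert \le \varepsilon$, the iterative
operation should be stopped and $R^{(l+1)}$ and $V^{(l+1)}$ should be output. Otherwise, set $l = l + 1$ and
repeat step 3).

5) Obtain the fuzzy clustering by the optimal cluster center discrimination principle: with the optimal
cluster center matrix $V^* = (V^*_1, V^*_2, \ldots, V^*_c)^T$, for every $u_k \in U$, if
$\lVert u_k - V^*_i \rVert = \min_{1 \le j \le c} \lVert u_k - V^*_j \rVert$, object $u_k$ should be
classified into class $i$.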
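
The iterative steps above (normalization, fuzzy classification matrix, cluster center update, stopping test, and nearest-center assignment) can be sketched in Python. This is only an illustrative sketch: because the formulas in this section are damaged, it assumes the standard fuzzy c-means forms for the membership and center updates, and the function names, the default fuzziness exponent `m=2.0`, and the tolerance `eps` are hypothetical choices rather than values taken from the paper.

```python
import math

def normalize(raw):
    """Min-max normalize each column of `raw` (a list of rows), as in step 1)."""
    cols = list(zip(*raw))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    # A constant column carries no information; map it to 0.0 (assumption).
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(row, lo, hi)] for row in raw]

def memberships(data, centers, m=2.0):
    """Fuzzy classification matrix R; R[i][k] is the membership of object k in cluster i."""
    c = len(centers)
    R = [[0.0] * len(data) for _ in range(c)]
    for k, u in enumerate(data):
        d = [math.dist(u, v) for v in centers]
        zeros = [i for i, di in enumerate(d) if di == 0.0]
        if zeros:
            # Object coincides with one or more centers: share full membership there.
            for i in zeros:
                R[i][k] = 1.0 / len(zeros)
            continue
        for i in range(c):
            R[i][k] = 1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0)) for j in range(c))
    return R

def update_centers(data, R, m=2.0):
    """Membership-weighted mean of the objects for each cluster (step 3)."""
    centers = []
    for row in R:
        w = [r ** m for r in row]
        total = sum(w)
        centers.append([sum(wk * u[j] for wk, u in zip(w, data)) / total
                        for j in range(len(data[0]))])
    return centers

def fuzzy_isodata(raw, init_centers, m=2.0, eps=1e-4, max_iter=100):
    """Run the iteration; `init_centers` are given in the normalized [0, 1] space."""
    data = normalize(raw)
    centers = [list(v) for v in init_centers]
    R = memberships(data, centers, m)
    for _ in range(max_iter):
        centers = update_centers(data, R, m)
        R_new = memberships(data, centers, m)
        change = max(abs(a - b) for ra, rb in zip(R_new, R) for a, b in zip(ra, rb))
        R = R_new
        if change <= eps:  # stopping test of step 4)
            break
    # Step 5): assign each object to its nearest cluster center.
    labels = [min(range(len(centers)), key=lambda i: math.dist(u, centers[i]))
              for u in data]
    return centers, R, labels
```

Each row of the input would hold the color features of one image, for example mean R, G, and B values; the memberships of every object sum to one across clusters, which is a quick sanity check on the output.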

Figure 1 illustrates the modified ISODATA clustering algorithm with the fuzzy logic method applied to
real-time images. Once the tiger image database is loaded, the original characteristic indicator matrix U* is
created. It describes the color feature values of all examined tiger image objects and of the reference
samples, where entry (i, j) is color characteristic indicator j of tiger object i. The original color values of
the tigers are normalized to fall within the specified range, which turns U* into U. The iterative process
then begins from the cluster center matrix V of the reference sample system of tiger images. Next, the fuzzy
classification matrix is computed from the reference tiger samples and the new incoming tiger image. The
cluster center matrix is modified for R, depending on the newly arrived tiger image, and the images are
re-clustered by determining the optimal cluster center. Finally, the fuzzy ISODATA clustering applies the
optimal cluster center discrimination principle, so that tigers of the same age group are clustered together
by their color characteristics.


Load the image database
Initialize the input parameters: the indicator matrix U* and the object reference samples U_ij
Set the data points and calculate the min and max values of U_ij
Initialize the cluster center V^(0) and the reference sample matrix R^(0)
Modify the cluster matrix R_ij
Compare the values of U_ij and R
Check against the precision ε
Update the cluster center V*
End


Figure 1. Proposed FBISODATA clustering algorithm 
Figure 2 shows the architecture of the proposed FBISODATA clustering method. The process has two
stages, a split function and a merge function. The split function classifies the pixels of an image, and the
merge function clusters the pixels of the image in order to retrieve the age of the tiger image from the image
database. To enhance segmentation accuracy and fidelity, combined features are used to describe the
over-segmented regions. These features incorporate both color and texture information together with the
intensity values of the colors.


[Figure 2 blocks: Image database; Feature extraction; Database image features; Feature extraction; Query
image features; Calculating the age (FBISODATA); Relevant image; Irrelevant image]


Figure 2. Architecture of the proposed FBISODATA clustering algorithm 


3.1. Splitting algorithm 
The splitting and merging algorithms segment the image into particular regions. The basic
representation framework is pyramidal. The algorithm generally starts from the initial assumption that the
whole image is a single region, and then computes the homogeneity criterion.
- Initialize the k centroid values.
- Assign the splitting function of the membership process.
- Scan every color in the image line by line, from the first line to the last.
- Find the pattern of each color and split the image into m*n blocks.
- Calculate the fuzzy classifier following (2).
- Check for a mismatch between the assigned label values r^(l) and r^(l+1).
- Assign labels to the unassigned pixels in the block.
- Remove small regions if necessary.
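
The idea of starting from the whole image as one region and splitting until a homogeneity criterion holds can be sketched as a recursive quadrant split. This is a simplified illustration, not the paper's procedure: it works on a single intensity channel, and the homogeneity test (value range within `threshold`) and the names `is_homogeneous` and `split` are assumptions made for the example.

```python
def is_homogeneous(block, threshold):
    """Hypothetical criterion: the value range inside the block is at most `threshold`."""
    values = [v for row in block for v in row]
    return max(values) - min(values) <= threshold

def split(image, top, left, height, width, threshold, min_size=1):
    """Recursively split a region into four quadrants until each one is homogeneous.

    Returns a list of regions as (top, left, height, width) tuples.
    """
    block = [row[left:left + width] for row in image[top:top + height]]
    if is_homogeneous(block, threshold) or height <= min_size or width <= min_size:
        return [(top, left, height, width)]
    h2, w2 = height // 2, width // 2
    quadrants = ((0, 0, h2, w2), (0, w2, h2, width - w2),
                 (h2, 0, height - h2, w2), (h2, w2, height - h2, width - w2))
    regions = []
    for dt, dl, h, w in quadrants:
        regions += split(image, top + dt, left + dl, h, w, threshold, min_size)
    return regions

# Example: a 4x4 image whose bottom-right quadrant differs from the rest.
image = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 90, 90],
    [10, 10, 90, 90],
]
regions = split(image, 0, 0, 4, 4, threshold=5)
```

The three quadrants with identical values come out as separate regions here; joining them again is the job of the merging stage.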


3.2. Merging algorithm 

For hierarchical segmentation, merging similar regions is very effective. The similarity measure
between regions and the corresponding stopping criterion are defined on the basis of the color-texture
features and the artifacts of the image. The merging process starts from the image's primitive color pixels
and continues until the termination criterion is reached and the segmentation is finished [18].

The pixel classes above are transformed into a histogram feature that incorporates the color
information and the local color distribution of the pixels, as shown in Figure 3. As high-level visual
information, the object's value is the probability that a region belongs to an identifiable object. The most
natural approach to region merging is to begin the growth from the raw data, with each color component
representing a color region. These regions almost certainly do not satisfy the condition on $H(R_i \cup R_j)$.
For each node N in G, where R_i is a neighbor region of N
    if H(R_i ∪ R_j) < T then
        Merge all regions satisfying the merging criterion.
        Change the region label value R.
        Modify the cluster object R.
        Update the similarity function G.
    end if
    if no regions satisfy the criterion
        End the merge process; otherwise, go back to the previous step.
    end if


Figure 3. Merging algorithm 
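
The merging loop of Figure 3 can be sketched as repeated merging of adjacent regions over a region adjacency graph. The sketch below is an assumption-laden illustration: it uses the absolute difference of region means as a stand-in for the homogeneity measure H, and the function name `merge_regions` and its data layout are hypothetical.

```python
def merge_regions(regions, adjacency, T):
    """Merge adjacent regions whose mean values differ by less than T.

    regions:   {label: (mean_value, pixel_count)}
    adjacency: iterable of (label_a, label_b) pairs of neighboring regions
    Returns (parent, merged) where parent maps every original label to the
    label of the region it ended up in, and merged holds the final regions.
    """
    merged = dict(regions)
    edges = {frozenset(e) for e in adjacency}
    parent = {label: label for label in regions}
    changed = True
    while changed:
        changed = False
        for edge in sorted(edges, key=sorted):
            a, b = sorted(edge)
            (mean_a, n_a), (mean_b, n_b) = merged[a], merged[b]
            if abs(mean_a - mean_b) < T:  # stand-in for H(R_a U R_b) < T
                # Absorb b into a and recompute the size-weighted mean.
                merged[a] = ((mean_a * n_a + mean_b * n_b) / (n_a + n_b), n_a + n_b)
                del merged[b]
                for label in parent:
                    if parent[label] == b:
                        parent[label] = a
                # Redirect b's edges to a and drop self-loops.
                edges = {frozenset(a if x == b else x for x in e) for e in edges}
                edges = {e for e in edges if len(e) == 2}
                changed = True
                break  # restart the scan on the updated graph
    return parent, merged

# Four 2x2 regions: three with mean 10 and one with mean 90.
start = {0: (10, 4), 1: (10, 4), 2: (10, 4), 3: (90, 4)}
neighbors = [(0, 1), (0, 2), (1, 3), (2, 3)]
parent, final = merge_regions(start, neighbors, T=5)
```

The loop stops when a full pass finds no pair of neighbors that satisfies the criterion, which mirrors the termination test in the figure.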


3.3. Neighbours of pixel 

A pixel p at coordinates (x, y) has four horizontal and vertical neighbors, whose coordinates are
given by (x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1). This set of pixels, referred to as the 4-neighbors
of p, is denoted N4(p). Each of these pixels is a unit distance from (x, y), and some of the neighbors of p
lie outside the digital image if (x, y) is on the border of the image. The four diagonal neighbors of p have
coordinates (x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1) and are denoted by ND(p).
The value m controls the degree of fuzziness of the clustering, with hard clustering at m = 1 and
increasingly fuzzy clustering at larger values of m; V is the set of c cluster centers and a is the fuzzy
partition of the image [25].
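
The two neighbor sets can be written down directly. In this short sketch the function names `n4`, `nd`, and `inside` are hypothetical, and `inside` simply discards neighbors that fall outside an image of the given width and height.

```python
def n4(x, y):
    """Horizontal and vertical neighbors of the pixel at (x, y)."""
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def nd(x, y):
    """Diagonal neighbors of the pixel at (x, y)."""
    return [(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)]

def inside(points, width, height):
    """Keep only the coordinates that lie inside a width x height image."""
    return [(x, y) for x, y in points if 0 <= x < width and 0 <= y < height]

# A corner pixel of a 3x3 image keeps two 4-neighbors and one diagonal neighbor.
corner_n4 = inside(n4(0, 0), 3, 3)
corner_nd = inside(nd(0, 0), 3, 3)
```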


3.4. Finding the nearest color 

This part of the study explains how to reduce the number of colors in an image by identifying, for
each color in the image, the closest match among the available colors. For simplicity, the method operates
on a predefined image palette containing several colors, such as the RGB colors and other combinations of
RGB values. The Euclidean distance is one of the best methods for finding the distance: it is calculated from
the differences between the individual RGB values of the actual color and those of each color available in
the palette. Squaring the differences is a simple way to ensure that negative and positive differences both
contribute to the distance, and the squared differences are summed to obtain the distance. The nearest color
is the one that has the minimum distance from the actual color. The rule is applied to the color
classification of each RGB pixel. The first stage of the process is to load an image. For example, the
original tiger images are used as standard images to test the different image processing techniques. The
second stage takes each color pixel of the image in turn and replaces it with the color from the available
palette that most closely matches it. The position of each pixel is then updated with the matched color, and
the RGB values of the pixel are stored in the database.
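
The nearest-color rule can be sketched as follows: the nearest palette color is the one with the smallest sum of squared RGB differences. The palette and the function names are hypothetical, chosen only for the example. The square root is omitted because it does not change which color is closest.

```python
def squared_distance(c1, c2):
    """Sum of squared differences between two RGB triples."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2))

def nearest_color(pixel, palette):
    """Palette entry with the smallest squared distance to `pixel`."""
    return min(palette, key=lambda color: squared_distance(pixel, color))

def quantize(image, palette):
    """Replace every pixel of `image` (rows of RGB triples) by its nearest palette color."""
    return [[nearest_color(p, palette) for p in row] for row in image]

# Illustrative palette: red, green, blue, black, white.
palette = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (0, 0, 0), (255, 255, 255)]
reduced = quantize([[(10, 10, 10), (250, 250, 240)]], palette)
```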


4. RESULTS AND DISCUSSION 

The proposed model is evaluated on the tiger image database and implemented with the MATLAB tool.
The database contains more than 500 camera trap images in different formats and sizes. There is only one
class, which encompasses tiger images from several age groups. The retrieval accuracy of the proposed
method is assessed within this class using the different age group categories. The proposed clustering method
extracts color features as RGB value vectors, with the computation run on a virtual machine, and a similarity
metric is used to measure the distance between them. Precision, recall, and F-measure are used to determine
performance while retrieving images from the image database by age group.
4.1. Computational complexity 

The computational complexity of the different clustering techniques was assessed to determine their
relative effectiveness in terms of time complexity analysis. Clustering with the fuzzy based ISODATA
clustering algorithm requires fewer steps than other clustering techniques, i.e. O(.). The complexity of the
hierarchical clustering method is expressed by the following equation.


$$O\left(\left(N - \sum_{r=0}^{m} N_r\right)^{2}\right) \qquad (4)$$


Here, N denotes the total number of color pixels, m stands for the number of clusters, and r is the number of
iterations. The improvement in the computational complexity of the optimized modified MC algorithm has a
vital consequence for extracting the clusters from the dataset: clusters that have been separated from the new
tiger image are set aside, thereby reducing the computational time for each successive region.

The grid plot of each database is based on the procedure values and the sample values for k = 2,
k = 4, and k = 6 clusters, obtained by executing the enhanced clustering algorithm on the tiger image
database, and is shown in Figure 4. The k = 2, 4, and 6 clusters are shown in different colors in Figure 4,
with every cluster plotted in its own color pattern.


Figure 4. Plot of the FBIC clusters for k = 2, k = 4, and k = 6 on the tiger image database 


4.2. Age prediction of the real time tiger image 

The accuracy of the proposed method is the difference between the actual value and the mean value of
the underlying mechanism that generates the data. The numbers of correctly segmented pixels are given by
the cells on the main diagonal of the error matrix ($T_{ii}$). A measure of overall segmentation accuracy
can be derived from these values by counting how many pixels were classified as the same age in the tiger
image database and in the ground truth ($\sum T_{ii}$) and dividing this value by the total number of pixels
($N = \sum R_i = \sum C_j$). The following equation is given [2].


$$a = \frac{\sum_{i} T_{ii}}{N} \qquad (5)$$


where $\sum_i T_{ii}$ is the total number of correctly classified pixels, and N denotes the total number of
pixels in the error matrix. Producer's accuracy is a reference-based accuracy that is widely used to measure
the percentage of correct predictions for a class of pixels:

$$A_i = \frac{T_{ii}}{R_i} \qquad (6)$$

where $T_{ii}$ denotes the number of correctly classified pixels in row i, and $R_i$ denotes the total number
of pixels in row i. User's accuracy is a map-based accuracy that is calculated by analyzing a class's
reference data and calculating the percentage of correct predictions for this sample:

$$a_j = \frac{T_{jj}}{C_j} \qquad (7)$$

where $T_{jj}$ denotes the number of correctly classified pixels in column j, and $C_j$ denotes the total
number of pixels in column j.
As a result, the threshold values of the color pixels, based on image pixel classification and image
pixel clustering, were used to predict the tiger's age. These estimates are evaluated using the tiger image
databases: the training datasets are compared with real-time camera trap image databases of tigers in the
wildlife forest.


$$d = \frac{N \sum_{i=1}^{m} T_{ii} - \sum_{i=1}^{m} R_i C_i}{N^{2} - \sum_{i=1}^{m} R_i C_i} \qquad (9)$$


Here, d is based on the basic Euclidean distance, N is the total number of pixels in an image, m is the
number of RGB classes, $\sum T_{ii}$ is the total number of correctly classified pixels in a tiger image, and
$R_i$ and $C_i$ are the numbers of pixels in row i and column i. Moreover, when selecting the particular
age of the tiger image, the threshold value of each color pixel was set for a specific tiger.
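
The error-matrix measures in this section can be illustrated with a small worked example. This is a sketch under stated assumptions: the 2x2 matrix is invented for illustration and is not data from the paper, the function names are hypothetical, and expression (9), which is damaged in the source, is read here as the usual chance-corrected agreement statistic built from the row totals R_i and column totals C_i.

```python
def overall_accuracy(T):
    """Correctly classified pixels (diagonal) divided by all pixels, as in (5)."""
    n = sum(sum(row) for row in T)
    return sum(T[i][i] for i in range(len(T))) / n

def row_accuracy(T):
    """Per-class accuracy relative to the row totals R_i, as in (6)."""
    return [T[i][i] / sum(T[i]) for i in range(len(T))]

def column_accuracy(T):
    """Per-class accuracy relative to the column totals C_j, as in (7)."""
    cols = list(zip(*T))
    return [T[j][j] / sum(cols[j]) for j in range(len(T))]

def agreement(T):
    """Assumed reading of (9): (N*sum(T_ii) - sum(R_i*C_i)) / (N^2 - sum(R_i*C_i))."""
    n = sum(sum(row) for row in T)
    diag = sum(T[i][i] for i in range(len(T)))
    cols = list(zip(*T))
    chance = sum(sum(T[i]) * sum(cols[i]) for i in range(len(T)))
    return (n * diag - chance) / (n * n - chance)

# Invented 2-class error matrix with 100 pixels in total.
T = [[40, 10],
     [5, 45]]
acc = overall_accuracy(T)   # 85 of 100 pixels lie on the diagonal
d = agreement(T)            # (100*85 - 5000) / (10000 - 5000)
```

Whether rows hold the reference classes and columns the predicted classes, or the other way round, depends on how the error matrix is laid out; the sketch only follows the row and column wording of the text.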

According to Table 1, the data can be categorized by age. The number of clusters is uniformly set at
three. For the one-year images, the highest precision is 96% and the lowest is 91%. The recall is 96% at the
highest and 91% at the lowest. The highest f-measure registered is 95.5%, while the lowest is 93%. All the
measures are then compared across the individual similarity measures in the table. The Euclidean distance
has the highest precision (96%) and the highest f-measure (95.5%), with a recall of 95%, whereas the city
block distance has the lowest precision (91%).


Table 1. Applied fuzzy based iterative self organizing data analysis clustering technique (FBIC) with various 
similarity measures in one year tiger image 
Age SM P RC FM NC 
1 Year City Block 0.91 0.96 0.935 3 
ChebyChev 0.95 0.91 0.93 3 
Euclidean 0.96 0.95 0.955 3 
Minkowski 0.92 0.94 0.93 3 
Note: SM-Similarity Measures, P-Precision, RC-Recall, 
FM-F-Measures, NC-Number of Clusters 


Table 1 shows how the age of a tiger image is predicted, evaluating similarity-based clustering
accuracy for distance functions such as city block, Chebychev, Minkowski, and Euclidean distance using
clustering metrics including precision, recall, and f-measure. The experimental results are shown in
Figure 5.
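
The four similarity measures compared in the tables are standard distance functions between feature vectors, and a short sketch makes the differences concrete. The function names and the default Minkowski order `p=3` are illustrative choices; the paper does not state which order it used.

```python
import math

def city_block(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def chebychev(a, b):
    """Largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def minkowski(a, b, p=3):
    """General form: p = 1 gives city block and p = 2 gives Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

# Two illustrative RGB-like feature vectors.
u, v = (0, 0, 0), (3, 4, 0)
distances = (city_block(u, v), chebychev(u, v), euclidean(u, v))
```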


[Figure 5 chart: measures for City Block, ChebyChev, Euclidean, and Minkowski on the 1 Year image]


Figure 5. Similarity metrics used with FBIC on a sample one-year tiger image 


According to Table 2, the data can be categorized by age. The number of clusters is again set at
three. For the two-year images, the highest precision is 96% and the lowest is 94%. The recall is 97% at the
highest and 94% at the lowest. The highest f-measure registered is 96.5%, while the lowest is 94.5%. All the
measures are then compared across the individual similarity measures in the table. The Euclidean distance
has the highest precision (96%), recall (97%), and f-measure (96.5%), whereas the city block distance has
the lowest precision.

Table 2 covers the tiger images under all similarity measures, including city block, Chebychev,
Minkowski, and Euclidean distances, and shows that similarity-based clustering can also be used to correctly
predict the age of tiger images. Accuracy is analyzed using the clustering indicators precision, recall, and
F-measure. Figure 6 shows the experimental results.


Table 2. Applied FBIC with different similarity measures using two year tiger image 


Age SM P RC FM NC 
2 Year City Block 0.94 0.95 0.945 3 
ChebyChev 0.95 0.94 0.945 3 
Euclidean 0.96 0.97 0.965 3 
Minkowski 0.95 0.96 0.955 3 


Note: SM-Similarity Measures, P-Precision, RC-Recall, 
FM-F-Measures, NC-Number of Clusters 


[Figure 6 chart: measurements for City Block, ChebyChev, Euclidean, and Minkowski on the 2 Year image]


Figure 6. Various similarity metrics applied with FBIC, tested on a two-year tiger image 


According to Table 3, the data can be categorized by age group. The number of clusters is uniformly
set at three. For the 15-year images, the highest precision is 96% and the lowest is 91%. The recall is 97%
at the highest and 90% at the lowest. The highest f-measure registered is 96%, while the lowest is 92%. All
the measures are then compared across the individual similarity measures in the table. The Chebychev
distance has the highest precision (96%), the Euclidean distance has the highest recall (97%) and the highest
f-measure (96%), whereas the Minkowski distance has the lowest recall and f-measure.


Table 3. FBIC algorithm with different similarity measures applied to the 15-year tiger image 


Age SM P RC FM NC 
15 Year City Block 0.91 0.96 0.935 3 
ChebyChev 0.96 0.95 0.955 3 
Euclidean 0.95 0.97 0.96 3 
Minkowski 0.94 0.90 0.92 3 


Note: SM-Similarity Measures, P-Precision, RC-Recall, 
FM-F-Measures, NC-Number of Clusters 


Table 3 measures similarity-based clustering accuracy for similarity functions such as city block,
Chebychev, Minkowski, and Euclidean distance, using clustering indicators such as precision, recall, and
F-measure. These results indicate that the age of the tiger image is identified and correctly predicted.
Figure 7 shows the experimental results.

Table 4 compares the existing and proposed clustering methods on each metric, including accuracy,
root mean square error (RMSE) values, processing time, and image retrieval time. The proposed algorithm
produces more accurate and effective results, and the clustering results are displayed in graphical form.
The proposed method achieves the highest accuracy with fuzzy based ISODATA clustering. Figure 8 shows
the accuracy, RMSE, time, and image retrieval performance of the proposed method on the tiger image
database compared with the existing method. The values for the proposed method are given in Table 4.
[Figure 7 chart: measures for City Block, ChebyChev, Euclidean, and Minkowski on the 15 Year image]


Figure 7. FBIC algorithm with different similarity measures applied to the 15-year tiger image 


Table 4. Comparison of proposed clustering metrics 
Method Accuracy RMSE Time Image retrieval 
Existing 87.77 0.646889 2.6 2.51 
Proposed 93.36667 0.515467 2.0188889 1.86 


[Figure 8 chart: measurements for the existing and proposed methods, including RMSE values 0.646889 and
0.515467]


Figure 8. Comparison of overall performance measures 


5. CONCLUSION 

The primary aim of this work is to determine the age of a tiger from image databases. The research
focuses on the proposed method, which was evaluated on a collection of more than 500 real-time tiger images
captured in the wildlife forest. Various types of images of adult tigers were tested. Colors are used to
segment the image. Clustering is performed on tiger images of different age groups, which differ in color,
skin tone, and stripes. The images are also divided into several groups according to the tiger's age and
color, and each image is characterized by its age and color differences. The age prediction of tigers from
the color of their images relies on the fuzzy clustering models described in the preceding sections. Tests on
real images demonstrated that the proposed method is effective in terms of accuracy and execution time when
compared with recent approaches. The clustering results are efficient and effective, as discussed in the
results section.